Word Sense Annotation of Polysemous Words by Multiple Annotators

Authors

  • Rebecca J. Passonneau
  • Ansaf Salleb-Aouissi
  • Vikas Bhardwaj
  • Nancy Ide
Abstract

We describe results of a word sense annotation task using WordNet, involving half a dozen well-trained annotators on ten polysemous words for three parts of speech. One hundred sentences for each word were annotated. Annotators had the same level of training and experience, but interannotator agreement (IA) varied across words. There was some effect of part of speech, with higher agreement on nouns and adjectives, but within the words for each part of speech there was wide variation. This variation in IA does not correlate with the number of senses in the inventory, or with the number of senses actually selected by annotators. In fact, IA was sometimes quite high for words with many senses. We claim that the IA variation is due to the word meanings, contexts of use, and individual differences among annotators. We find some correlation of IA with sense confusability as measured by a sense confusion threshold (CT). Data mining for association rules on a flattened data representation indicating each annotator’s sense choices identifies outliers for some words, and systematic differences among pairs of annotators on others.
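
The abstract does not say which agreement coefficient was used, so the following is only a minimal sketch of one simple way to quantify IA: the average pairwise observed agreement over a flattened table with one row per sentence and one column per annotator. The function name `pairwise_agreement`, the toy data, and the sense IDs are all hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pairwise_agreement(labels):
    """Average pairwise observed agreement.

    labels: list of rows, one per annotated sentence; each row holds the
    sense label chosen by each annotator, in a fixed annotator order.
    This is an illustrative measure, not the coefficient used in the paper.
    """
    n_annotators = len(labels[0])
    pairs = list(combinations(range(n_annotators), 2))
    matches = 0
    for row in labels:
        # Count annotator pairs that chose the same sense for this sentence.
        matches += sum(row[a] == row[b] for a, b in pairs)
    return matches / (len(pairs) * len(labels))


# Hypothetical toy data: 4 sentences, 3 annotators, invented sense IDs.
toy = [
    ["s1", "s1", "s1"],
    ["s1", "s2", "s1"],
    ["s3", "s3", "s3"],
    ["s2", "s2", "s4"],
]
score = pairwise_agreement(toy)
print(f"pairwise agreement: {score:.3f}")
```

Observed agreement of this kind is not corrected for chance, so a chance-corrected coefficient would normally be preferred when comparing words with different numbers of senses.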


Similar resources

Making Sense of Word Sense Variation

We present a pilot study of word-sense annotation using multiple annotators, relatively polysemous words, and a heterogeneous corpus. Annotators selected senses for words in context, using an annotation interface that presented WordNet senses. Interannotator agreement (IA) results show that annotators agree well or not, depending primarily on the individual words and their general usage properti...


Anveshan: A Framework for Analysis of Multiple Annotators' Labeling Behavior

Manual annotation of natural language to capture linguistic information is essential for NLP tasks involving supervised machine learning of semantic knowledge. Judgements of meaning can be more or less subjective, in which case instead of a single correct label, the labels assigned might vary among annotators based on the annotators’ knowledge, age, gender, intuitions, background, and so on. We...


Handling Subtle Sense Distinctions Through Wordnet Semantic Types

In this paper we challenge the question of whether there is value in having multiple layers of semantic information associated with corpus semantic annotation. In this context we introduce a semantic annotation experiment in which novice annotators were asked to assign sense tags to a set of polysemous corpus nouns, using Wordnet as their referential sense repository. Wordnet is a rich sense in...


Analysis Of A Hand-Tagging Task

We analyze the results of a semantic annotation task performed by novice taggers as part of the WordNet SemCor project (Landes et al., in press). Each polysemous content word in a text was matched to a sense from WordNet. Comparing the performance of the novice taggers with that of experienced lexicographers, we find that the degree of polysemy, part of speech, and the position within the WordN...


Multi-Modal Word Synset Induction

A word in natural language can be polysemous, having multiple meanings, as well as synonymous, meaning the same thing as other words. Word sense induction attempts to find the senses of polysemous words. Synonymy detection attempts to find when two words are interchangeable. We combine these tasks, first inducing word senses and then detecting similar senses to form word-sense synonym sets (syn...



Publication date: 2010